Week 06: Cool Type Checking & Runtime Organization
Cool Type Checking
Static vs. Dynamic Typing
- Static type systems detect common errors at compile time.
- But some correct programs are disallowed.
- Some argue for dynamic type checking instead.
- Others want more expressive static type checking.
- Def.
- The dynamic type of an object is the class C that is used in the
new C expression that created it at run-time.
- The static type of an expression captures all dynamic types the expression could have at compile-time.
Soundness Theorem
- Target
- For all expressions $E$,
dynamic_type(E) = static_type(E)
- Sample
class B inherits A;
x:A <- new B;
- Static type of
x: x:A
- Dynamic type of
x: new B
- Soundness theorem for the Cool type system: $$\forall E,\ dynamic\_type(E) \leq static\_type(E)$$
- Subclasses only add attributes or methods
- Methods can be redefined but with same type!
Self Type
Scenario (Why we need to use SELF_TYPE)
class Count {
i : int <- 0;
inc () : Count { {
i <- i + 1;
self;
}};
};
class Stock inherits Count { name : String; };
class Main { Stock a <- (new Stock).inc(); ... a.name ... };
- Because
inc() will return Count.
- Use
inc():SELF_TYPE instead.
- The type checker can now prove
- $O,M,C\vdash (new\ Count).inc():Count$
- $O,M,C\vdash (new\ Stock).inc():Stock$
- Remark:
SELF_TYPE is not a dynamic type. It is a static type. In cgen, it will fetch the class tag of the instance.
In general, if SELF_TYPE appears textually in the class $C$ as the declared type of $E$ then $dynamic\_type(E) \leq C$
The meaning of SELF_TYPE depends on where it appears
- $SELF\_TYPE_C$ to refers to an occurrence of $SELF\_TYPE$ in the body of $C$
- $SELF\_TYPE_C \leq C$
Type Operation
- $T_1 \leq T_2$: $T_1$ is a subtype of $T_2$
- $lub(T_1, T_2)$: the least-upper bound of $T_1$ and $T_2$
Let $T$ and $T’$ be any types but SELF_TYPE. Theorems between two types:
- $SELF\_TYPE_C \leq SELF\_TYPE_C$
- $lub(SELF\_TYPE_C, SELF\_TYPE_C) = SELF\_TYPE_C$
- $SELF\_TYPE_C \leq T$, if $C\leq T$
- $SELF\_TYPE_C$ can be any subtype of $C$
- Include $C$ itself.
- $lub(SELF\_TYPE_C, T) = lub(C, T)$
- $T \leq SELF\_TYPE_C$ always false.
- Special case: $SELF\_TYPE_C$ is a path and $T$ is the leaf of the path.
- $lub(SELF\_TYPE_C, T) = lub(C, T)$
- $T \leq T'$
Self Type Usage
SELF_TYPE is not allowed everywhere a type can appear.
class T inherits T’ {...}: T and T' cannot be SELF_TYPE
x:T: T can be SELF_TYPE
let x: T in E: T can be SELF_TYPE
new T: T can be SELF_TYPE
m@T(E_1, ..., E_n): T cannot be SELF_TYPE
m(x : T) : T’ { ... }: Only T’ can be SELF_TYPE
Self Type Checking
The meaning of SELF_TYPE depends on the enclosing class $C$ $(O,M,C\vdash e:T)$.
We use lub to design type rules using SELF_TYPE
- [Dispatch] $$\frac{O,M,C\vdash e_0:T_0\\ O,M,C\vdash e_1:T_1\\ \dots\\ O,M,C\vdash e_n:T_n\\ M(T_0, f)=(T'_{1},\dots,T'_{n'},T'_{n+1})\\ T'_{n+1} \neq SELF\_TYPE\\ T_i\leq T'_{i},\ for\ 1\leq i\leq n}{O,M,C\vdash e_0.f(e_1,\dots,e_n):T'_{n+1}}$$
- [Static Dispatch] $$\frac{O,M,C\vdash e_0:T_0\\ O,M,C\vdash e_1:T_1\\ \dots\\ O,M,C\vdash e_n:T_n\\ M(T_0, f)=(T'_{1},\dots,T'_{n'},T'_{n+1})\\ T'_{n+1} \neq SELF\_TYPE\\ T_i\leq T'_{i},\ for\ 1\leq i\leq n}{O,M,C\vdash e_0@T.f(e_1,\dots,e_n):T'_{n+1}}$$
New rules using SELF_TYPE
- $$\frac{}{O,M,C\vdash self: SELF\_TYPE_C} $$
- $$\frac{}{O,M,C\vdash new\ SELF\_TYPE: SELF\_TYPE_C} $$
Error Recovery in Semantic
Problem
- What type is assigned to an expression with no legitimate type?
- This type will influence the typing of the enclosing expression.
Solution
- Assign type
Object to ill-typed expressions.
- Or assign a new type
No_type for use with ill-typed expressions
No_type $\leq$ C, for all types $C$
- Every operation is defined for
No_type
- The class hierarchy is not a tree anymore.
- The
Object solution is fine in the course project.
Runtime Organization
Introduction
Goal
- Management of Run-time Resources
- Correspondence between static (compile-time) and dynamic (run-time) structures
- Storage organization
When a program is invoked
- The OS allocates space for the program
- The code is loaded into part of the space
- The OS jumps to the entry point (i.e., “main”)
Activations
Goal
- Correctness of the program
- Speed of the program
Assumption
- Execution is sequential; control moves from one point in a program to another in a well-defined order.
- When a procedure is called, control always returns to the point immediately after the call.
Activation
- An invocation of procedure $P$ is an activation of $P$.
- The lifetime of an activation of $P$ is
- (Run-time) All the steps to execute $P$
- (Nested) Including all the steps in procedures $P$ calls
- The lifetime of a variable
x is the portion of execution in which x is defined.
- Lifetime is a dynamic (run-time) concept.
- Scope is a static concept.
- Lifetimes of procedure activations are properly nested.
- Use a stack to track currently active procedures.
Activation Records
The information needed to manage one procedure activation is called an activation record (AR) or frame.
Scenario
A procedure $F$ calls $G$, then $G$'s activation record contains a mix of info about $F$ and $G$. $F$ is suspended until $G$ completes, at which point $F$ resumes.
$G$’s AR contains
- Complete execution of $G$
- Resume execution of $F$
- Space for $G$'s return value
- Actual parameters
- Pointer to the previous activation record (control link: points to AR of caller of $G$)
- Machine status prior to calling $G$
- Contents of registers & program counter
- Local variables
- Other temporary values
AR for function f
- (Low)
- Code
- Static Data
- Main
- f: result
- f: argument
- f: control link
- f: return address
- Empty
- Heap
- (High)
Globals & Heaps
Globals are assigned a fixed address once.
Memory Management
- Code Segment: Object code
- For many languages, fixed size and read only
- Static: data (not code) with fixed addresses (e.g., global data)
- Fixed size, may be readable or writable
- Stack: an AR for each currently active procedure
- Each AR usually fixed size, contains locals
- Heap: all other data
- In C, heap is managed by
malloc and free.
Alignment
Data is word aligned if it begins at a word boundary.
To align the 5-ch string, add 3 padding characters to the string.
Stack Machines
Goal
Use the stack to handle operating.
An instruction $r = F(a_1,\dots,a_n)$
- Pops n operands from the stack
- Computes the operation $F$ using the operands
- Pushes the result r on the stack
Location of the operands/result always on the top of the stack.
n-register Stack Machine: keep the top n locations of the pure stack machine's stack in registers
1-register stack machine: The register is called the accumulator
- Add:
acc = acc + top_of_stack
Consider an expression $op(e_1,...,e_n)$ ($e_1,...,e_n$ are subexpressions)
- For each $e_i\ (0 < i < n)$
- Compute $e_i$
- Push result on the stack
- pop n-1 values from the stack, compute op
- Store result in the accumulator